Journal of Chemical Theory and Computation — Latest Matching Preprints

1

From gHBfix to NBfix: Reweighting-Driven Refinement of Hydrogen-Bond Interactions in RNA Force Fields

Mlynsky, V.; Kuehrova, P.; Bussi, G.; Otyepka, M.; Sponer, J.; Banas, P.

2026-03-21 biophysics 10.64898/2026.03.20.713292 medRxiv

Top 0.1%

76.9%

Understanding RNA structural dynamics is essential for elucidating its biological functions, and molecular dynamics (MD) simulations provide an important atomistic complement to experimental approaches. However, the predictive power of MD is fundamentally limited by the accuracy of the underlying empirical Force Fields (FFs), particularly in capturing the delicate balance of non-bonded interactions. Here, we present a systematic reparameterization strategy that replaces the external gHBfix19 hydrogen-bond (H-bond) correction potential with an equivalent set of NBfix Lennard-Jones modifications within a state-of-the-art RNA FF. Using a quantitatively converged temperature replica-exchange MD ensemble of the GAGA tetraloop, we employed a reweighting-based optimization protocol to derive NBfix parameters that reproduce the thermodynamic effects of the original gHBfix19 terms. Sequential optimization of individual gHBfix19 components proved essential to ensure stable and transferable parameter refinement. The resulting fully reformulated NBfix-based variant, termed OL3CP-NBfix19, was validated on a representative set of RNA motifs, including tetranucleotides, A-form duplexes, and tetraloops. Across all tested systems, its performance is comparable to that of the reference gHBfix19 FF. By embedding the H-bond corrections directly into the standard non-bonded framework, the NBfix formulation eliminates external biasing potentials, simplifies practical deployment, and reduces computational overhead. Beyond this specific reparameterization, our results demonstrate a practical workflow for translating targeted H-bond corrections into native FF terms for efficient biomolecular simulations.
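
The reweighting idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes that frames were sampled with a reference potential and that a per-frame energy change is available for a trial parameter modification. The function name `reweight_average` and the toy numbers are hypothetical and are not taken from the authors' protocol.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def reweight_average(observable, delta_u, temperature=298.0):
    """Estimate <A> under U0 + dU using frames sampled under U0."""
    beta = 1.0 / (KB * temperature)
    shift = min(delta_u)  # subtract the minimum for numerical stability
    weights = [math.exp(-beta * (du - shift)) for du in delta_u]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / total

# Toy ensemble: 1 marks a folded frame, 0 an unfolded frame.
folded = [1, 1, 0, 0, 0, 0]
# Hypothetical perturbation: trial parameters lower the energy of
# folded frames by 1 kcal/mol and leave unfolded frames unchanged.
delta_u = [-1.0 if f else 0.0 for f in folded]

p_old = sum(folded) / len(folded)          # folded fraction as sampled
p_new = reweight_average(folded, delta_u)  # folded fraction after reweighting
```

An optimizer could then adjust the trial parameters until a reweighted quantity such as `p_new` matches a target value, without running new simulations for each trial.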

2

Proteus software for physics-based protein design

Mignon, D.; Druart, K.; Opuu, V.; Polydorides, S.; Villa, F.; Gaillard, T.; Michael, E.; Archontis, G.; Simonson, T.

2020-07-01 biophysics 10.1101/2020.06.30.179549 medRxiv

Top 0.1%

65.7%

We describe methods and software for physics-based protein design. The folded state energy combines molecular mechanics with Generalized Born solvent. Sequence and conformation space are sampled with Replica Exchange Monte Carlo, assuming one or a few fixed protein backbone structures and discrete side chain rotamers. Whole protein design and enzyme design are presented as illustrations. Full redesign of three PDZ domains was done using a simple, empirical, unfolded state model. Designed sequences were very similar to natural ones. Enzyme redesign exploited a powerful, adaptive, importance sampling approach that allows the design to directly target substrate binding, reaction rate, catalytic efficiency, or the specificity of these properties. Redesign of tyrosyl-tRNA synthetase stereospecificity is reported as an example. Competing Interest Statement: The authors have declared no competing interest.
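
As a generic illustration of the replica-exchange step named in this abstract, the sketch below computes the standard Metropolis acceptance probability for swapping the configurations of two replicas held at different inverse temperatures. It is not the Proteus implementation, and the function names are hypothetical.

```python
import math
import random

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Acceptance probability min(1, exp[(beta_i - beta_j) * (E_i - E_j)])."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng):
    """Return True if the two replicas should exchange configurations."""
    return rng.random() < swap_probability(beta_i, beta_j, energy_i, energy_j)

# Cold replica (beta = 1.0) holds the higher-energy state: always accepted.
p_downhill = swap_probability(1.0, 0.5, -10.0, -12.0)
# Cold replica already holds the lower-energy state: accepted with exp(-1).
p_uphill = swap_probability(1.0, 0.5, -12.0, -10.0)
```

In a design setting the "configuration" would be a sequence plus rotamer assignment, and the energies would come from the folded-state energy function.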

3

Transferability of ion force fields to OPC water: Maintaining single-ion and ion-pairing properties

Wiebeler, C.; Falkner, S.; Schwierz, N.

2026-04-02 biophysics 10.64898/2026.03.31.715553 medRxiv

Top 0.1%

61.0%

Accurate ion force fields are essential for molecular dynamics simulations of biomolecular systems, particularly in combination with modern water models such as OPC. While OPC water improves the description of bulk water and biomolecules, the transferability of existing ion force fields to this model remains an open question. Here, we systematically assess the transferability of monovalent and divalent ion force field parameters (Li+, Na+, K+, Cs+, Mg2+, Ca2+, Sr2+, Ba2+, Cl- and Br-) to OPC water by comparing single-ion and ion-pairing properties with experimental data. Our analysis reveals that no single literature parameter set provides accurate results for all ions when directly transferred to OPC water. We hence introduce the MS/G-LB(OPC) force field, which combines Mamatkulov-Schwierz-Grotz cation parameters with Loche-Bonthuis anion parameters. MS/G-LB(OPC) reproduces hydration free energies, first-shell structural properties and activity derivatives at low salt concentrations. Our results demonstrate that transferring ion parameters to OPC can lead to significant and ion-specific deviations from experimental data, making careful validation essential. At the same time, the systematic transfer and combination of ion parameters from existing force fields can provide a practical and computationally efficient alternative to full reparameterization. MS/G-LB(OPC) is available at https://git.rz.uni-augsburg.de/cbio-gitpub/opc-ion-force-fields.

4

Rare-Event Sampling using a Reinforcement Learning-Based Weighted Ensemble Method

Yang, D.; Goldberg, A.; Chong, L.

2024-10-11 biophysics 10.1101/2024.10.09.617475 medRxiv

Top 0.1%

60.7%

Despite the power of path sampling strategies in enabling simulations of rare events, such strategies have not reached their full potential. A common challenge that remains is the identification of a progress coordinate that captures the slow relevant motions of a rare event. Here we have developed a weighted ensemble (WE) path sampling strategy that exploits reinforcement learning to automatically identify an effective progress coordinate among a set of potential coordinates during a simulation. We apply our WE strategy with reinforcement learning to three benchmark systems: (i) an egg carton-shaped toy potential, (ii) an S-shaped toy potential, and (iii) a dimer of the HIV-1 capsid protein (C-terminal domain). To enable rapid testing of the latter system at the atomic level, we employed discrete-state synthetic molecular dynamics trajectories using a generative, fine-grained Markov state model that was based on extensive conventional simulations. Our results demonstrate that using concepts from reinforcement learning with a weighted ensemble of trajectories automatically identifies relevant progress coordinates among multiple candidates at a given time during a simulation. Due to the rigorous weighting of trajectories, the simulations maintain rigorous kinetics.
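
The "rigorous weighting of trajectories" in weighted ensemble sampling rests on a resampling step that splits and merges walkers while conserving total probability. The sketch below shows that step for a single bin only. It is a simplified illustration with hypothetical names, assumes strictly positive weights, and does not include the reinforcement-learning selection of progress coordinates described in the abstract.

```python
import random

def resample_bin(walkers, target, rng):
    """Return `target` walkers from a list of (state, weight) pairs,
    conserving the total weight of the bin."""
    walkers = list(walkers)
    # Split the heaviest walker into two equal halves until there are enough.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()
        walkers += [(state, weight / 2), (state, weight / 2)]
    # Merge the two lightest walkers until there are few enough; the
    # survivor is chosen with probability proportional to its weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        survivor = s1 if rng.random() < w1 / (w1 + w2) else s2
        walkers = [(survivor, w1 + w2)] + walkers[2:]
    return walkers

rng = random.Random(0)
start = [("a", 0.5), ("b", 0.3), ("c", 0.2)]
grown = resample_bin(start, 5, rng)   # splitting case
shrunk = resample_bin(start, 2, rng)  # merging case
```

Because splitting halves a weight and merging sums two weights, the total probability in the bin is unchanged, which is what allows unbiased kinetics to be recovered.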

5

NEAT-DNA: A Chemically Accurate, Sequence-Dependent Coarse-Grained Model for Large-Scale DNA Simulations

Riveros, I.; Zhang, B.

2025-11-08 biophysics 10.1101/2025.11.07.687145 medRxiv

Top 0.1%

60.4%

DNA's physical properties play a central role in genome organization and regulation, but simulating its behavior across biologically relevant scales remains a major computational challenge. Coarse-grained DNA models have enabled faster simulations, yet they often sacrifice chemical accuracy or produce unphysical conformations, limiting their utility for studying genome structure. A key difficulty has been constructing a model that is both efficient enough for large-scale simulations and faithful to the molecular mechanics of DNA. Here we introduce NEAT-DNA, a new coarse-grained DNA model that resolves longstanding limitations in physical realism and parameter optimization. By combining a physically principled energy formulation with a unified training framework that integrates data from both atomistic simulations and experiments, NEAT-DNA accurately reproduces sequence-dependent structure and flexibility while remaining computationally efficient. This approach marks a significant advance over previous models, which either lacked sequence specificity or introduced distortions inconsistent with experimental observations. NEAT-DNA bridges this gap, offering a high-fidelity yet tractable representation of DNA suitable for exploring chromatin folding. More broadly, it provides a foundation for large-scale simulations that couple molecular detail with gene-level chromatin organization, opening new avenues for predictive modeling in structural genomics.

6

Efficient RNA Folding Simulation via a Structure-Based Single-Site-Per-Nucleotide Model

Thornton, T.; Lin, X.

2025-12-14 biophysics 10.64898/2025.12.13.694107 medRxiv

Top 0.1%

60.3%

Computational modeling of large RNA structures and their dynamics is essential for uncovering the molecular mechanisms underlying various genomic processes and RNA-regulated cellular functions. Residue-resolution modeling is an effective approach for simulating large biomolecular structures while preserving essential sequence and structural features present in atomic structures. Here, we implemented a structure-based single-site-per-nucleotide (SSPN) RNA model using the GPU-accelerated OpenMM software and evaluated its computational efficiency and accuracy by simulating RNA hairpins unfolding under force. Our simulations compare favorably with an earlier, more detailed RNA model and quantitatively reproduce experimentally measured thermodynamic properties of RNA under mechanical stretching. This SSPN model enables scalable and accurate simulations of large RNA ensembles, such as long non-coding RNAs and RNA liquid-liquid phase separation.

7

An essential dynamics-based elastic network model to unravel the conformational dynamics of DNA, RNA, and protein-nucleic acid complexes

Cannariato, M.; Scaramozzino, D.; Lee, B. H.; Deriu, M. A.; Orellana, L.

2026-03-13 biophysics 10.64898/2026.03.11.710985 medRxiv

Top 0.1%

60.1%

The flexibility of DNA and RNA is known to play a central role in numerous biological processes, including chromatin organization and gene regulation. While a wide range of computational approaches have been developed to investigate the conformational dynamics and flexibility of proteins, analogous methods for nucleic acids remain comparatively underexplored. Elastic Network Models (ENMs) - coarse-grained mechanical representations in which macromolecules are modeled as networks of nodes connected by elastic springs - have been successfully applied to proteins, often capturing experimentally observed conformational changes through a small number of harmonic normal modes. Building on a previously validated three-bead ENM for RNA, here we introduce edENM, an essential dynamics-refined ENM for DNA, RNA, and protein-nucleic acid complexes, parametrized using a diverse set of Molecular Dynamics simulations. The vibrational modes of the new edENM show good agreement with NMR data and experimental ensembles, while avoiding the unrealistic and localized deformability of previous ENM parametrizations. Additionally, we integrated this new edENM into eBDIMS, a Brownian Dynamics-based framework that enables the simulation of large-scale and anharmonic conformational transitions in protein assemblies. In this way, we are now able to explore functional motions in large protein-nucleic acid complexes such as chromatin subunits and ribosomes.

8

DiffPIE: Guiding Deep Generative Models to Explore Protein Conformations under External Interactions

Wang, Y.; Chen, M.

2025-04-30 biophysics 10.1101/2025.04.27.650875 medRxiv

Top 0.1%

59.5%

In recent years, many foundation generative models have been developed to predict structures of molecules and materials. Although these foundation models have achieved great success, it is challenging to collect enough data to train foundation generative models. One such example is to predict protein conformations with protein-environmental interactions (PEI), such as interactions introduced by organic linkers or material surfaces. We propose a physics-guided route to extrapolate foundation models beyond their training domain. Our method couples a pretrained deep generative model with explicit, physics-based interaction potentials for PEI, steering sampling toward conformations consistent with external constraints without any retraining or fine-tuning. We demonstrate accurate and efficient conformation prediction of (i) cyclic peptide with organic linkers and (ii) peptide adsorbed on the gold surface. The generated structures serve as high-quality initial conditions for downstream simulations, providing a general, systematic approach to extend foundation models to proteins under system-specific environmental interactions.

9

Fine-tuning molecular mechanics force fields to experimental free energy measurements

Rufa, D.; Fass, J.; Chodera, J. D.

2025-01-08 biophysics 10.1101/2025.01.06.631610 medRxiv

Top 0.1%

56.3%

Alchemical free energy methods using molecular mechanics (MM) force fields are essential tools for predicting thermodynamic properties of small molecules, especially via free energy calculations that can estimate quantities relevant for drug discovery such as affinities, selectivities, the impact of target mutations, and ADMET properties. While traditional MM force fields rely on hand-crafted, discrete atom types and parameters, modern approaches based on graph neural networks (GNNs) learn continuous embedding vectors that represent chemical environments from which MM parameters can be generated. Excitingly, GNN parameterization approaches provide a fully end-to-end differentiable model that offers the possibility of systematically improving these models using experimental data. In this study, we treat a pretrained GNN force field--here, espaloma-0.3.2--as a foundation simulation model and fine-tune its charge model using limited quantities of experimental hydration free energy data, with the goal of assessing the degree to which this can systematically improve the prediction of other related free energies. We demonstrate that a highly efficient "one-shot fine-tuning" method using an exponential (Zwanzig) reweighting free energy estimator can improve prediction accuracy without the need to resimulate molecular configurations. To achieve this "one-shot" improvement, we demonstrate the importance of using effective sample size (ESS) regularization strategies to retain good overlap between initial and fine-tuned force fields. Moreover, we show that leveraging low-rank projections of embedding vectors can achieve accuracy improvements comparable to those of higher-dimensional approaches in a variety of data-size regimes. Our results demonstrate that linearly-perturbative fine-tuning of foundation model electrostatic parameters to limited experimental data offers a cost-effective strategy that achieves state-of-the-art performance in predicting hydration free energies on the FreeSolv dataset.
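
The two quantities named in this abstract, the exponential (Zwanzig) estimator and the effective sample size, can be written down in a few lines. The sketch below is a generic textbook form with hypothetical function names. It uses the Kish definition of ESS as an assumption, and it does not show how the authors fold the ESS into their training objective.

```python
import math

KT = 0.5922  # kT in kcal/mol near 298 K

def zwanzig_delta_f(delta_u, kt=KT):
    """dF = -kT ln < exp(-dU/kT) >_0 over frames from the initial force field."""
    shift = min(delta_u)  # factor out the largest weight for stability
    mean_exp = sum(math.exp(-(du - shift) / kt) for du in delta_u) / len(delta_u)
    return shift - kt * math.log(mean_exp)

def effective_sample_size(delta_u, kt=KT):
    """Kish effective sample size, (sum w)^2 / sum(w^2), of the weights."""
    shift = min(delta_u)
    weights = [math.exp(-(du - shift) / kt) for du in delta_u]
    return sum(weights) ** 2 / sum(w * w for w in weights)

# A uniform shift changes dF but keeps full overlap (ESS equals N).
uniform = [2.0, 2.0, 2.0, 2.0]
# One frame dominating the weights signals poor overlap (ESS close to 1).
dominated = [0.0, 0.0, 0.0, -5.0]

df_uniform = zwanzig_delta_f(uniform)
ess_uniform = effective_sample_size(uniform)
ess_dominated = effective_sample_size(dominated)
```

A low ESS indicates that the fine-tuned parameters have moved too far from the ensemble that was actually sampled, which is the situation a regularization term would penalize.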

10

Simple Adjustment of Intra-nucleotide Base-phosphate Interaction in OL3 AMBER Force Field Improves RNA Simulations

Mlynsky, V.; Kuhrova, P.; Stadlbauer, P.; Krepl, M.; Otyepka, M.; Banas, P.; Sponer, J.

2023-09-06 biophysics 10.1101/2023.09.05.556403 medRxiv

Top 0.1%

56.2%

Molecular dynamics (MD) simulations represent an established tool to study RNA molecules. The outcome of MD studies depends, however, on the quality of the force field (ff) used. Here we suggest a correction for the widely used AMBER OL3 ff by adding a simple adjustment of nonbonded parameters. The reparameterization of the Lennard-Jones potential for the -H8...O5- and -H6...O5- atom pairs addresses an intra-nucleotide steric clash occurring in the type 0 base-phosphate interaction (0BPh). The non-bonded fix (NBfix) modification of 0BPh interactions (the NBfix0BPh modification) was tuned via a reweighting approach and, subsequently, tested using an extensive set of standard and enhanced sampling simulations of both unstructured and folded RNA motifs. The modification corrects a minor but visible intra-nucleotide clash for the anti nucleobase conformation. We observed that structural ensembles of small RNA benchmark motifs simulated with the NBfix0BPh modification provide better agreement with experiments. No side effects of the modification were observed in standard simulations of larger structured RNA motifs. We suggest that the combination of the OL3 RNA ff and the NBfix0BPh modification is a viable option to improve RNA MD simulations.
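
To make the NBfix idea concrete, the sketch below contrasts default Lennard-Jones pair parameters from Lorentz-Berthelot combining rules with a pair-specific override. The atom type names `HX` and `OX` and all numerical values are placeholders chosen for illustration. They are not OL3 parameters and not the values derived in the preprint.

```python
import math

def lj_energy(r, rmin, eps):
    """12-6 Lennard-Jones energy in Rmin form: eps * [(Rmin/r)^12 - 2 (Rmin/r)^6]."""
    x = (rmin / r) ** 6
    return eps * (x * x - 2.0 * x)

def pair_parameters(type_i, type_j, atom_types, nbfix):
    """Return (Rmin_ij, eps_ij), preferring a pair-specific NBfix entry."""
    key = tuple(sorted((type_i, type_j)))
    if key in nbfix:
        return nbfix[key]
    half_rmin_i, eps_i = atom_types[type_i]
    half_rmin_j, eps_j = atom_types[type_j]
    # Lorentz-Berthelot: sum of half-radii, geometric mean of well depths.
    return half_rmin_i + half_rmin_j, math.sqrt(eps_i * eps_j)

# Placeholder per-type parameters: (Rmin/2 in Angstrom, eps in kcal/mol).
atom_types = {"HX": (1.4, 0.015), "OX": (1.7, 0.17)}
# Placeholder override for the HX...OX pair only.
nbfix = {("HX", "OX"): (2.9, 0.05)}

r = 2.5  # a short contact distance in Angstrom
e_default = lj_energy(r, *pair_parameters("OX", "HX", atom_types, {}))
e_nbfix = lj_energy(r, *pair_parameters("OX", "HX", atom_types, nbfix))
```

Because the override applies to one pair of atom types only, every other interaction involving `HX` or `OX` keeps its combining-rule parameters, which is what makes this kind of correction targeted.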

11

Artificial Intelligence Boosted Molecular Dynamics

Do, H. N.; Miao, Y.

2023-03-27 bioinformatics Community evaluation 10.1101/2023.03.25.534210 medRxiv

Top 0.1%

55.8%

We have developed a new Deep Boosted Molecular Dynamics (DBMD) method. Probabilistic Bayesian neural network models were implemented to construct boost potentials that exhibit Gaussian distribution with minimized anharmonicity, thereby allowing for accurate energetic reweighting and enhanced sampling of molecular simulations. DBMD was demonstrated on model systems of alanine dipeptide and the fast-folding protein and RNA structures. For alanine dipeptide, 30 ns DBMD simulations captured up to 83-125 times more backbone dihedral transitions than 1 µs conventional molecular dynamics (cMD) simulations and were able to accurately reproduce the original free energy profiles. Moreover, DBMD sampled multiple folding and unfolding events within 300 ns simulations of the chignolin model protein and identified low-energy conformational states comparable to previous simulation findings. Finally, DBMD captured a general folding pathway of three hairpin RNAs with the GCAA, GAAA, and UUCG tetraloops. Based on a deep learning neural network, DBMD provides a powerful and generally applicable approach to boosting biomolecular simulations. DBMD is available as open source in OpenMM at https://github.com/MiaoLab20/DBMD/.

12

Exploring Conformational Transitions of RNA Dimers via Machine Learning Potentials

Medrano Sandonas, L.; Tolmos Nehme, M.; Cofas-Vargas, L. F.; Olivos-Ramirez, G. E.; Cuniberti, G.; Poblete, S.; Poma, A. B.

2026-02-26 biophysics 10.64898/2026.02.25.707885 medRxiv

Top 0.1%

55.4%

RNA is a flexible biopolymer that adopts diverse conformations while forming structural motifs essential for its function. Classical RNA force fields often show limited transferability and inefficient sampling of transitions between stable states, particularly in moderately large RNA. To address these limitations, quantum-informed machine learning (ML) potentials have recently emerged as a promising alternative, offering improved accuracy and transferability relative to classical force fields. Here, we assess ML potentials for exploring RNA conformations using the adenine-adenine dinucleoside monophosphate (ApA) dimer, a fundamental RNA building block. We generated an extensive quantum-mechanical (QM) dataset for ApA conformations obtained from temperature replica exchange molecular dynamics (TREMD) simulations. Despite its small size, the ApA dimer exhibits six conformations in which quantum effects and solvent-mediated interactions play crucial roles. Using this dataset, we parameterized ML potentials based on the equivariant MACE architecture and informed by both ab-initio and semi-empirical data. The resulting potentials reproduce key conformational features of the ApA system, including base stacking, sugar puckering, and backbone flexibility, and provide broader coverage of structural transitions than the general-purpose SO3LR and MACE-OFF24 models. These findings highlight the importance of quantum-accurate RNA force fields towards the structural and energetic characterization of RNA complexes.

13

Binding Affinity Estimation From Restrained Umbrella Sampling Simulations

Govind Kumar, V.; Agrawal, S.; Suresh Kumar, T. K.; Moradi, M.

2021-10-28 biophysics 10.1101/2021.10.28.466324 medRxiv

Top 0.1%

55.1%

The protein-ligand binding affinity quantifies the binding strength between a protein and its ligand. Computer modeling and simulations can be used to estimate the binding affinity or binding free energy using data- or physics-driven methods or a combination thereof. Here, we discuss a purely physics-based sampling approach based on biased molecular dynamics (MD) simulations, which in spirit is similar to the stratification strategy suggested previously by Woo and Roux. The proposed methodology uses umbrella sampling (US) simulations with additional restraints based on collective variables such as the orientation of the ligand. The novel extension of this strategy presented here uses a simplified and more general scheme that can be easily tailored for any system of interest. We estimate the binding affinity of human fibroblast growth factor 1 (hFGF1) to heparin hexasaccharide based on the available crystal structure of the complex as the initial model and four different variations of the proposed method to compare against the experimentally determined binding affinity obtained from isothermal calorimetry (ITC) experiments. Our results indicate that enhanced sampling methods that sample along the ligand-protein distance without restraining other degrees of freedom do not perform as well as those with additional restraint. In particular, restraining the orientation of the ligands plays a crucial role in reaching a reasonable estimate for binding affinity. The general framework presented here provides a flexible scheme for designing practical binding free energy estimation methods.

14

MOFF2: A Transferable Coarse-Grained Protein Force Field for Predictive Condensate Simulations

Liu, S.; Zhang, Y.; Riveros, I.; Wang, C.; Zhang, B.

2026-06-10 biophysics 10.64898/2026.06.10.731384 medRxiv

Top 0.1%

55.1%

Coarse-grained protein force fields enable simulations of biomolecular systems at length and time scales that are difficult to access with atomistic models, but achieving transferability across folded, intrinsically disordered, and multidomain proteins remains challenging. A central difficulty is that one-bead-per-residue models must represent chemically specific residue interactions while also absorbing solvent-mediated and many-body effects into a simplified energy function. Here, we present MOFF2, a transferable coarse-grained protein force field that combines residue-pair-specific interactions with a density-dependent many-body potential. MOFF2 is optimized using a two-stage strategy: bottom-up parameter learning from heterogeneous reference ensembles followed by refinement against experimental conformational observables. The resulting model provides balanced performance across ordered proteins, intrinsically disordered proteins, and multidomain proteins, and predicts condensate saturation-concentration trends for A1-LCD variant systems. Analysis of the learned parameters reveals chemically interpretable interaction patterns and density-dependent effects that explain the model's improved transferability. These results demonstrate that combining a generalized coarse-grained energy function with data-driven optimization can produce a practical and interpretable force field for protein conformational and condensate simulations.

15

Benchmarking generative AI and physics based molecular simulation for sampling conformational heterogeneity in T4 Lysozyme

Bhakat, S.

2026-05-13 biophysics 10.64898/2026.05.10.724101 medRxiv

Top 0.1%

54.7%

Wild-type T4 lysozyme (T4L) is used as a benchmark to evaluate conformational sampling across generative AI, AI-accelerated molecular simulation (AMS), and physics-based enhanced molecular dynamics (EMD). A four-state model (exposed/open, exposed/closed, buried/open, and buried/closed) is defined using physically meaningful collective variables. While generative AI methods (AF-cluster, MSA subsampling of AlphaFold2, ConforFold, AlphaFlow, ESMFlow, ConfRover, BioEmu) largely sample only the exposed/open state, AMS, which integrates generative ensembles with iterative molecular dynamics, recovers all states and reproduces equilibrium populations similar to EMD and experimental smFRET signatures.

16

Towards Convergence in Folding Simulations of RNA Tetraloops: Comparison of Enhanced Sampling Techniques and Effects of Force Field Corrections

Mlynsky, V.; Janecek, M.; Kuhrova, P.; Frohlking, T.; Otyepka, M.; Bussi, G.; Banas, P.; Sponer, J.

2021-12-01 biophysics 10.1101/2021.11.30.470631 medRxiv

Top 0.1%

54.6%

Atomistic molecular dynamics (MD) simulations represent an established technique for investigating RNA structural dynamics. Despite continuous development, contemporary RNA simulations still suffer from suboptimal accuracy of empirical potentials (force fields, ffs) and sampling limitations. Development of efficient enhanced sampling techniques is important for two reasons. First, they help overcome the sampling limitations and, second, they can be used to quantify ff imbalances provided they reach sufficient convergence. Here, we study two RNA tetraloops (TLs), namely the GAGA and UUCG motifs. We perform extensive folding simulations and calculate folding free energies (ΔGfold) with the aim of comparing different enhanced sampling techniques and testing several modifications of the nonbonded terms extending the AMBER OL3 RNA ff. We demonstrate that replica exchange solute tempering (REST2) simulations with 12-16 replicas do not show any sign of convergence even when extended to a time scale of 120 µs per replica. However, the combination of REST2 with well-tempered metadynamics (ST-MetaD) achieves good convergence on a time scale of 5-10 µs per replica, improving the sampling efficiency by at least two orders of magnitude. Effects of ff modifications on ΔGfold energies were initially explored by the reweighting approach and then validated by new simulations. We tested several manually prepared variants of the gHBfix potential which improve stability of the native state of both TLs by up to ~2 kcal/mol. This is sufficient to conveniently stabilize the folded GAGA TL while the UUCG TL still remains under-stabilized. Appropriate adjustment of van der Waals parameters for the C-H...O5 base-phosphate interaction is also shown to be capable of further stabilizing the native states of both TLs by ~0.6 kcal/mol.

17

Exploring RNA conformational ensembles in silico: progress and challenges

Roeder, K.; Stirnemann, G.; Meuret, L.; Barquero-Morera, D.; Forget, S.; Wales, D. J.; Pasquali, S.

2026-02-18 molecular biology 10.64898/2026.02.18.706514 medRxiv

Top 0.1%

54.3%

RNA function is intrinsically linked to its structural polymorphism, with molecules exploring the heterogeneous conformational ensembles resulting from complex energy landscapes. These landscapes arise from competing interactions, small energetic separations between microstates, and strong coupling to the environment, posing significant challenges for both experimental characterization and molecular simulation. In this chapter, we review current computational strategies that aim to explore RNA conformational ensembles in silico, with a specific focus on energy landscape-based approaches and atomistic simulations. We discuss key limitations related to sampling efficiency, force-field accuracy, and ensemble analysis, and illustrate their impact through case studies on a self-cleaving ribozyme and an H-type pseudoknot. Finally, we highlight emerging directions, including closer integration with experimental data and the growing role of machine learning, which will probably reinforce the predictive power of in silico RNA energy landscape exploration.

18

Toward Accurate RNA Folding Thermodynamics: Evaluation of Enhanced Sampling Methods for Force Field Benchmarking

Kuehrova, P.; Mlynsky, V.; Otyepka, M.; Sponer, J.; Banas, P.

2026-01-21 biophysics 10.64898/2026.01.19.700441 medRxiv

Top 0.1%

52.8%

Biologically functional RNAs operate near marginal stability, and their rugged free-energy landscapes and profound structural dynamics - typically not captured by structural biology experiments - play decisive roles. Atomistic molecular dynamics (MD) simulations provide a unique means to characterize these features. However, the applicability of atomistic MD is currently limited by accessible simulation timescales and, most importantly, by force-field (FF) accuracy. Folding free energies (ΔG°fold) of small RNA motifs represent well-defined targets for quantitative benchmarking of RNA FFs. In practice, however, obtaining thermodynamic estimates that are sufficiently robust for direct comparison with experimental data remains highly challenging, even for small RNA systems, and many published studies rely on sampling that is not fully converged. Here, we systematically assess the performance of widely used advanced enhanced-sampling techniques using the 8-mer r(gcGAGAgc) tetraloop as a representative benchmark system. We test temperature replica exchange (T-REMD), two solute-tempering variants of replica exchange (REST2 and REHT), as well as well-tempered metadynamics and on-the-fly probability enhanced sampling combined with solute tempering (ST-MetaD and ST-OPES). Among the tested approaches, T-REMD proves to be the most robust, yielding reproducible folding equilibria and consistent estimates of ΔG°fold after approximately 20 µs of simulation time, independent of the initial folded or unfolded conformational ensemble. Our results provide practical guidelines for selecting sampling protocols suitable for quantitative RNA benchmarks and lay the foundation for systematic validation and future refinement of RNA FFs.
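
For a two-state picture, a folding free energy follows directly from the folded population in the reference replica, and comparing runs started from folded and unfolded structures gives a simple convergence check. The sketch below assumes that frames have already been classified as folded or unfolded. The function names, frame counts and tolerance are hypothetical and are not taken from the preprint.

```python
import math

R = 0.0019872041  # gas constant in kcal/(mol*K)

def folding_free_energy(n_folded, n_total, temperature=298.0):
    """dG_fold = -RT ln(p_folded / p_unfolded) for a two-state model."""
    p = n_folded / n_total
    if not 0.0 < p < 1.0:
        raise ValueError("both states must be sampled to estimate dG_fold")
    return -R * temperature * math.log(p / (1.0 - p))

def runs_agree(dg_a, dg_b, tolerance=0.3):
    """Crude convergence check between two independent runs (kcal/mol)."""
    return abs(dg_a - dg_b) <= tolerance

# Hypothetical frame counts from runs started folded and unfolded.
dg_from_folded = folding_free_energy(9000, 10000)
dg_from_unfolded = folding_free_energy(8800, 10000)
converged = runs_agree(dg_from_folded, dg_from_unfolded)
```

Agreement between the two starting points is a necessary but not sufficient sign of convergence, so in practice it would be combined with block averaging or similar error estimates.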

19

Multi-Agent Reinforcement Learning-based Adaptive Sampling for Conformational Sampling of Proteins

Kleiman, D. E.; Shukla, D.

2022-05-31 biophysics 10.1101/2022.05.31.494208 medRxiv

Top 0.1%

52.8%

Machine Learning is increasingly applied to improve the efficiency and accuracy of Molecular Dynamics (MD) simulations. Although the growth of distributed computer clusters has allowed researchers to obtain higher amounts of data, unbiased MD simulations have difficulty sampling rare states, even under massively parallel adaptive sampling schemes. To address this issue, several algorithms inspired by reinforcement learning (RL) have arisen to promote exploration of the slow collective variables (CVs) of complex systems. Nonetheless, most of these algorithms are not well-suited to leverage the information gained by simultaneously sampling a system from different initial states (e.g., a protein in different conformations associated with distinct functional states). To fill this gap, we propose two algorithms inspired by multi-agent RL that extend the functionality of closely-related techniques (REAP and TSLC) to situations where the sampling can be accelerated by learning from different regions of the energy landscape through coordinated agents. Essentially, the algorithms work by remembering which agent discovered each conformation and sharing this information with others at the action-space discretization step. A stakes function is introduced to modulate how different agents sense rewards from discovered states of the system. The consequences are threefold: (i) agents learn to prioritize CVs using only relevant data, (ii) redundant exploration is reduced, and (iii) agents that obtain higher stakes are assigned more actions. We compare our algorithm with other adaptive sampling techniques (Least Counts, REAP, TSLC, and AdaptiveBandit) to show and rationalize the gain in performance.

20

Calculating Protein-Ligand Residence Times Through State Predictive Information Bottleneck based Enhanced Sampling

Lee, S.; Wang, D.; Seeliger, M.; Tiwary, P.

2024-04-20 biophysics 10.1101/2024.04.16.589710 medRxiv

Top 0.1%

52.8%

Understanding drug residence times in target proteins is key to improving drug efficacy and understanding target recognition in biochemistry. While drug residence time is just as important as binding affinity, atomic-level understanding of drug residence times through molecular dynamics (MD) simulations has been difficult primarily due to the extremely long timescales. Recent advances in rare event sampling have allowed us to reach these timescales, yet predicting protein-ligand residence times remains a significant challenge. Here we present a semi-automated protocol to calculate the ligand residence times across 12 orders of magnitude of timescales. In our proposed framework, we integrate a deep learning-based method, the state predictive information bottleneck (SPIB), to learn an approximate reaction coordinate (RC) and use it to guide the enhanced sampling method metadynamics. We demonstrate the performance of our algorithm by applying it to six different protein-ligand complexes with available benchmark residence times, including the dissociation of the widely studied anti-cancer drug Imatinib (Gleevec) from both wild-type Abl kinase and drug-resistant mutants. We show how our protocol can recover quantitatively accurate residence times, potentially opening avenues for deeper insights into drug development possibilities and ligand recognition mechanisms.